Section I: Project Summary
1. Overview of Project

This  project  is  performed  under  the  Office  of  Naval  Research  program  on  Basic  and  Applied  Research  in 
Sea-Based  Aviation  (ONR  BAA12-SN-0028).  This  project  addresses  the  Sea  Based  Aviation  (SBA) 
virtual  dynamic  interface  (VDI)  research  topic  area  “Fast,  high-fidelity  physics-based  simulation  of 
coupled  aerodynamics  of  moving  ship  and  maneuvering  rotorcraft”.  The  work  is  a  collaborative  effort 
between  Penn  State,  NAVAIR,  and  Combustion  Research  and  Flow  Technology  (CRAFT  Tech).  This 
document  presents  progress  at  Penn  State  University. 

All  software  supporting  piloted  simulations  must  run  at  real  time  speeds  or  faster.  This  requirement 
drives  the  number  of  equations  that  can  be  solved  and  in  turn  the  fidelity  of  supporting  physics  based 
models.  For  real-time  aircraft  simulations,  all  aerodynamic  related  information  for  both  the  aircraft  and 
the  environment  are  incorporated  into  the  simulation  by  way  of  lookup  tables.  This  approach  decouples 
the aerodynamics of the aircraft from the rest of its external environment. For example, ship airwakes are
calculated using CFD solutions without the presence of the helicopter main rotor. The gusts from the
turbulent ship airwake are then replayed into the aircraft aerodynamic model via lookup tables. For up
and away simulations, this approach works well. However, when an aircraft is flying very close to
another body (e.g. a ship superstructure), significant aerodynamic coupling can exist. The main rotor of
the helicopter distorts the flow around the ship, possibly resulting in significant differences in the
disturbance on the helicopter. In such cases it is necessary to perform simultaneous calculations of both
the  Navier-Stokes  equations  and  the  aircraft  equations  of  motion  in  order  to  achieve  a  high  level  of 
fidelity.  This  project  will  explore  novel  numerical  modeling  and  computer  hardware  approaches  with 
the  goal  of  real  time,  fully  coupled  CFD  for  virtual  dynamic  interface  modeling  &  simulation. 
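
The difference between the lookup-table approach and full coupling can be sketched with a toy model. The
classes below (ToyAircraft, ToyFlowSolver) and all coefficients are hypothetical stand-ins chosen for
illustration only; they are not GENHEL-PSU or the CRAFT CFD solver. In the one-way case the gusts are
recorded without the rotor and replayed; in the coupled case the rotor loading is fed back into the flow
at every time step.

```python
from dataclasses import dataclass


@dataclass
class ToyAircraft:
    """Hypothetical flight dynamics stand-in with one vertical-velocity state."""
    w: float = 0.0  # vertical velocity (arbitrary units)

    def step(self, gust: float, dt: float) -> float:
        # First-order response toward the local gust.
        self.w += dt * (gust - self.w)
        # Return a proxy for the rotor loading passed back to the flow solver.
        return -self.w


class ToyFlowSolver:
    """Hypothetical flow solver stand-in: one scalar gust value at the rotor."""

    def __init__(self) -> None:
        self.gust = 1.0

    def step(self, rotor_source: float, dt: float) -> float:
        # The rotor source term modifies the flow; zero source means no rotor.
        self.gust += dt * (0.5 * rotor_source - 0.1 * self.gust)
        return self.gust


def record_table(n_steps: int, dt: float) -> list:
    """Precompute gusts without the rotor present (lookup-table approach)."""
    flow = ToyFlowSolver()
    return [flow.step(0.0, dt) for _ in range(n_steps)]


def run_one_way(table: list, dt: float) -> float:
    """Replay recorded gusts into the aircraft; the flow never sees the rotor."""
    aircraft = ToyAircraft()
    for gust in table:
        aircraft.step(gust, dt)
    return aircraft.w


def run_coupled(n_steps: int, dt: float) -> float:
    """Advance flow and aircraft together, exchanging data every time step."""
    aircraft, flow = ToyAircraft(), ToyFlowSolver()
    source = 0.0
    for _ in range(n_steps):
        gust = flow.step(source, dt)      # flow solver sees the rotor loading
        source = aircraft.step(gust, dt)  # aircraft sees the updated gust
    return aircraft.w


if __name__ == "__main__":
    dt = 0.01
    print("one-way:", run_one_way(record_table(100, dt), dt))
    print("coupled:", run_coupled(100, dt))
```

In this toy setup the two runs diverge because the rotor feedback changes the gust history, which is the
effect the lookup-table approach cannot capture.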

Penn  State  is  supporting  the  project  through  integration  of  their  GENHEL-PSU  simulation  model  of  a 
utility  helicopter  with  CRAFT  Tech’s  flow  solvers.  Penn  State  will  provide  their  piloted  simulation 
facility  (the  VLRCOE  rotorcraft  simulator)  for  preliminary  demonstrations  of  pilot-in-the-loop 
simulations.  Finally,  Penn  State  will  provide  support  for  a  final  demonstration  of  the  methods  on  the 
NAVAIR  Manned  Flight  Simulator. 

Activities  this  period 

During this reporting period, we implemented the CRAFT CFD code on the Penn State VLRCOE flight
simulator and performed the first Pilot-in-the-Loop CFD (PILCFD) tests at Penn State using the COCOA5
cluster. The initial tests were performed with 1.2 million grid cells using 640 processors. The tests
verified that the network configuration works and demonstrated the integration of the flight simulator
and Penn State computing infrastructure. Initial tests ran 3x slower than real-time. To investigate
our system performance and to identify the drawbacks of
using relatively coarse computational domains to reach real-time speeds, additional fully coupled
simulations were performed. The results showed the sensitivity of the dynamic response of the
helicopter to the coarseness of the mesh used.

PILCFD Simulations at Penn State VLRCOE Flight Simulator

The initial PILCFD efforts were performed early in 2016 at CRAFT Tech’s facility and results were
presented at AHS Forum 72 (Ref 1). All these efforts were performed on CRAFT Tech’s in-house cluster with
32 nodes, each containing 8 Intel Xeon E5530 processors (2.4 GHz), using a 40 Gbps InfiniBand
interconnect (32 nodes x 8 procs = 256 processors). As a flight platform, a workstation running X-Plane
with a simple joystick was used (Figure 1). At that time, we were able to demonstrate the first
near-real-time (3x slower) PILCFD test for a simplified shedding wake using 300K grid cells and an
inviscid flow assumption.


Figure  1  -  Initial  near-real-time  PILCFD  test  was  performed  at  CRAFT  Tech  in  early  2016. 

During  this  reporting  period,  we  implemented  the  CRAFT  CFD  code  on  the  Penn  State  computing 
infrastructure  (COCOA5)  and  integrated  the  COCOA5  cluster  with  the  VLRCOE  Flight  simulator 
(Figure 2). The Penn State VLRCOE flight simulator cab is from the historic XV-15 tilt rotor aircraft. A
four-channel  Control  Loading  System  provides  fully  programmable  high  bandwidth  control  loading  and 
reads  the  pilot  stick  positions.  The  simulator  integrates  up  to  eight  different  computers  on  a  local 
network  to  distribute  the  computing  load,  with  separate  computers  providing  Image  Generation, 
interface  with  the  control  loading  system,  cockpit  instrument  displays,  and  the  math  model  of  the 
rotorcraft  dynamics.  The  visual  system  consists  of  a  three-channel  high-resolution  projection  system 
(WSXGA+ native resolution), a 15' diameter by 12'-high cylindrical screen, and image distortion
correction  and  blending.  This  provides  a  seamless  170°  horizontal  field-of-view.  The  X-Plane 
Professional  flight  simulation  software  is  used  for  Image  Generation,  with  three  separate  computers 
driving  each  visual  channel.  The  COCOA5  cluster  is  made  up  of  a  single  master  node  and  47 
computational nodes built on the 7th generation of ProLiant servers from HP. Each computational node is
built on the DL165 platform and uses two AMD 6276 “Interlagos” 16-core processors (introduced in Nov
2011) at 2.3 GHz, giving 32 cores per node and 1,504 cores in total. The network communication between
nodes is established using a low latency 20 Gb/s InfiniBand fabric.


Figure  2  -  First  PILCFD  test  at  Penn  State  VLRCOE  flight  simulator  facility 

Towards real-time PILCFD simulations, we conducted several efforts at Penn State. The initial efforts
showed that the network configuration (Figure 3) between the flight simulator and the computing cluster
(COCOA5) works well. In contrast to our first PILCFD efforts at CRAFT Tech’s facility, this time a
relatively finer mesh with 1.2 million grid cells was used to resolve the same simplified shedding wake
using a viscous flow assumption with no turbulence model. The resolved scales of turbulence were modeled
using Monotone Integrated Large Eddy Simulation (MILES), which has been shown to be adequate for
airwake simulations and reduces the computational cost of the simulations (Ref 2).

Figure 3 - PILCFD demonstration case network configuration used for Penn State Flight Simulator


Figure  4  shows  the  dynamic  response  of  the  simulated  helicopter  during  the  first  PILCFD  test  at  Penn 
State  VLRCOE  flight  simulator.  The  pilot  performed  a  simple  approach  case  during  the  test.  The  flight 
simulator was fully coupled with the CFD simulation, which was running on 640 processors of the COCOA5
cluster.  The  achieved  average  execution  time  of  the  simulation  was  3  times  slower  than  real-time. 


Figure 4 - The changes in dynamic response of the helicopter during first PILCFD simulation at Penn State
VLRCOE Flight Simulator.


Performance  Study  and  Grid  Dependency 

To quantify the timing performance of the developed coupling tool on different numbers of processors, a
study was performed to measure the average runtime costs achieved with three different
computational domains. For this study, two different flight maneuvers were chosen: hover in an open
domain and approach to a simple backward-facing step. For each case, three different computational
domains with different mesh resolutions at the rotor region were prepared. For the hover case, the
computational grid sizes were 550k, 700k and 5.98m cells, while the mesh resolutions were 4 ft, 2 ft and
1 ft, respectively. For the approach case, the computational domain sizes were 330k, 1.2m and 8m cells,
while the grid resolutions at the rotor disk region were 4 ft, 2 ft and 1 ft, respectively.


Figure 5 - Average runtime cost achieved with each of the computational grids used for the hover
case, running on different numbers of processors.

Figure 5 shows the average runtime achieved with each of the computational grids used for the hover
case, running on different numbers of processors. All these simulations were performed using an NLDI
controller to keep the helicopter at the desired position. For the 550k and 700k cases, the CFD time step
was set to 0.01 seconds, and for the 5.98m case, the CFD time step was set to 0.005 seconds.


Figure 6 - Average runtime cost achieved with each of the computational grids used for the
approach case, running on different numbers of processors.

Figure 6 shows the average runtime achieved with each of the computational grids used for the approach
case, running on different numbers of processors. Similar to the hover case, all these simulations were
performed using an NLDI controller to keep the helicopter on the desired flight trajectory. For the 330k
and 1.2m cases, the CFD time step was set to 0.01 seconds, and for the 8m case, the CFD time step was set
to 0.005 seconds.

As can be seen from Figure 5 and Figure 6, the execution times do not drop in exact proportion to the
increase in the number of processors. Parallel efficiency of a PDE solver depends on several factors such
as partitioning algorithm, algorithmic scalability and load balancing. In a scalable implementation, the
time per iteration is expected to reduce in inverse proportion to the number of processors (Ref 3). Prior
scalability tests with the standalone CRAFT CFD solver showed better performance than these results. The
reason for the poor scalability of the coupled simulations may be the early-stage coupling interface
implementation in the CFD solver. For example, currently, the source point search task is performed in
parallel (to the solver tasks) on a single processor. As the CFD domain is partitioned across larger
numbers of processors, the cost of the point search task remains the same and may become a larger and
larger percentage of the total cost. This may result in poor load balancing in the simulation and a stall
in the solver scalability performance.
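
The stall described above can be illustrated with a minimal timing model. This is a sketch under stated
assumptions, not a measurement of the CRAFT CFD solver: it assumes the solver work divides evenly across
processors, that the point search runs concurrently on one processor at a fixed cost, and that each
iteration finishes only when both are done. The work and cost values are hypothetical.

```python
def iteration_time(n_procs: int, solver_work: float, search_cost: float) -> float:
    """Wall-clock time per iteration (seconds).

    solver_work: total solver work in processor-seconds (hypothetical value),
                 assumed to divide evenly across n_procs.
    search_cost: fixed cost of the single-processor point search (hypothetical),
                 assumed to run concurrently with the solver.
    """
    return max(solver_work / n_procs, search_cost)


def parallel_efficiency(n_procs: int, n_ref: int,
                        solver_work: float, search_cost: float) -> float:
    """Efficiency relative to a reference run on n_ref processors (1.0 is ideal)."""
    t_ref = iteration_time(n_ref, solver_work, search_cost)
    t = iteration_time(n_procs, solver_work, search_cost)
    return (t_ref * n_ref) / (t * n_procs)


if __name__ == "__main__":
    work, search = 16.0, 0.03  # hypothetical numbers for illustration only
    for n in (256, 512, 1024, 2048):
        t = iteration_time(n, work, search)
        eff = parallel_efficiency(n, 256, work, search)
        print(f"{n:5d} procs: {t:.4f} s/iteration, efficiency {eff:.2f}")
```

With these illustrative numbers the time per iteration stops improving once the solver share drops below
the fixed search cost, so adding processors beyond that point only lowers the efficiency.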


Table 1 - Achieved minimum execution times for each of the computational domains.

Case               Number of    CFD time step   Achieved minimum   Real-time
                   processors   (sec)           execution time     performance
                   used                         (sec/iteration)

Hover - 550k       704          0.01            0.033              3.3x slower
Hover - 700k       640          0.01            0.0395             3.9x slower
Hover - 5.98m      960          0.005           0.148              29.6x slower
Approach - 330k    256          0.01            0.0408             4.08x slower
Approach - 1.2m    604          0.01            0.0519             5.19x slower
Approach - 8m      928          0.005           0.215              43x slower
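
The real-time performance column in Table 1 is consistent with dividing the wall-clock execution time per
iteration by the CFD time step; a value of 1.0 or below would mean real-time or faster. The short sketch
below reproduces that arithmetic from the tabulated values (the function and variable names are
illustrative only).

```python
# (case, processors used, CFD time step [s], execution time per iteration [s])
TABLE_1 = [
    ("Hover - 550k", 704, 0.01, 0.033),
    ("Hover - 700k", 640, 0.01, 0.0395),
    ("Hover - 5.98m", 960, 0.005, 0.148),
    ("Approach - 330k", 256, 0.01, 0.0408),
    ("Approach - 1.2m", 604, 0.01, 0.0519),
    ("Approach - 8m", 928, 0.005, 0.215),
]


def real_time_factor(cfd_time_step: float, time_per_iteration: float) -> float:
    """Wall-clock seconds spent per simulated second (1.0 means real-time)."""
    return time_per_iteration / cfd_time_step


if __name__ == "__main__":
    for case, procs, dt, t_iter in TABLE_1:
        factor = real_time_factor(dt, t_iter)
        print(f"{case:16s} {procs:4d} procs  {factor:6.2f}x slower than real-time")
```

Halving the CFD time step for the finest grids doubles the number of iterations per simulated second,
which is part of why the 5.98m and 8m cases fall so much further from real-time than the coarser grids.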

Figure 7 to Figure 10 show the variations in the dynamic response of the simulated helicopter
approaching a backward-facing step. The “No-coupling” case represents the standalone GENHEL-PSU
simulation without any airwake disturbance. The remaining simulations represent the fully coupled
simulations performed with different computational domains. Results show that the fully coupled
simulation using the 4 ft mesh resolution at the area of interest shows the least airwake disturbance. The
coarse mesh structure results in very high dissipation in the flow solution and creates only minor
disturbances on the helicopter body. The 1.2m and 8m cases resolve the airwake better than the 330k case:
the airwake intensity is relatively higher, which results in more fluctuation in the helicopter dynamic
responses. The fluctuations are similar for the first half of the simulation, when the helicopter is far
behind the box structure. However, results are different when the helicopter gets close to the box
structure, where the flow shows more chaotic behavior.

Figure 7 - Variations in the positions of the simulated helicopter.

Figure 8 - Variations in the attitudes of the simulated helicopter.

Figure 9 - Variations in the control inputs of the simulated helicopter.

Figure 10 - Variations in the total power required of the simulated helicopter.

Significance of Results

The CRAFT CFD code was successfully implemented on the Penn State VLRCOE flight simulator and the
first Pilot-in-the-Loop CFD (PILCFD) tests were performed at Penn State using the COCOA5 cluster. The
initial tests were performed with 1.2 million grid cells using 640 processors and ran 3 times slower
than real-time on the COCOA5 cluster. We verified that the network configuration works
well and that we are able to perform PILCFD tests using the actual flight simulator and Penn State
computing infrastructure. Note that COCOA5 uses a relatively old architecture, which is almost 6 years
old. We are currently building a new cluster system at our department, and our initial tests with the
newly built cluster system (COCOA6) showed almost 2x faster performance than COCOA5. Additionally, we
investigated the timing performance of the CFD solver on different numbers of processors for two
different cases, using three different mesh resolutions for each case. Initial test results showed the
limitations of the CFD solver scalability with the increasing number of processors as a result of the
latest coupling interface implementations in the code. We will perform further study to investigate these
limitations.

2.  Plans  and  upcoming  events  for  next  reporting  period 

• We are performing additional tests to quantify the timing performance of the CFD solver. We plan to
repeat PILCFD tests using the actual flight simulator and different computational domains. Results will
help us determine the potential speed-up gains from optimizing the grid used for the simulation.

• We will be collaborating with CRAFT Tech to discuss scalability issues of the CFD solver and seek
possible improvements. Initial test results showed the limitations of the CFD solver scalability with the
increasing number of processors as a result of the latest coupling interface implementations in the code.
We will perform further study to investigate these limitations.